Lossless Data Compression at Finite Blocklengths
نویسندگان
چکیده
This paper provides an extensive study of the behavior of the best achievable rate (and other related fundamental limits) in variable-length lossless compression. In the non-asymptotic regime, the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are shown to be tightly coupled. Several precise, quantitative bounds are derived, connecting the distribution of the optimal codelengths to the source information spectrum, and an exact analysis of the best achievable rate for arbitrary sources is given. Fine asymptotic results are proved for arbitrary (not necessarily prefix) compressors on general mixing sources. Non-asymptotic, explicit Gaussian approximation bounds are established for the best achievable rate on Markov sources. The source dispersion and the source varentropy rate are defined and characterized. Together with the entropy rate, the varentropy rate serves to tightly approximate the fundamental non-asymptotic limits of fixed-to-variable compression for all but very small blocklengths.
منابع مشابه
Grammar-based codes: A new class of universal lossless source codes
We investigate a type of lossless source code called a grammar-based code, which, in response to any input data string over a fixed finite alphabet, selects a context-free grammar representing in the sense that is the unique string belonging to the language generated by . Lossless compression of takes place indirectly via compression of the production rules of the grammar . It is shown that, su...
متن کاملUniversal lossless compression via multilevel pattern matching
A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length , the MPM code operates at (log log ) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching pat...
متن کاملLossless Microarray Image Compression by Hardware Array Compactor
Microarray technology is a new and powerful tool for concurrent monitoring of large number of genes expressions. Each microarray experiment produces hundreds of images. Each digital image requires a large storage space. Hence, real-time processing of these images and transmission of them necessitates efficient and custom-made lossless compression schemes. In this paper, we offer a new archi...
متن کاملStudy On Universal Lossless Data Compression by using Context Dependence Multilevel Pattern Matching Grammar Transform
In this paper, the context dependence multilevel pattern matching(in short CDMPM) grammar transform is proposed; based on this grammar transform, the universal lossless data compression algorithm, CDMPM code is then developed. Moreover, it is proved that this algorithms’ worst case redundancy among all individual sequences of length n from a finite alphabet is upper bounded by ) log / 1 ( n C w...
متن کاملParameterized Markov Models for Efficient Compression of Grayscale Images
Introduction. Lossless compression using finite context variable order Markov models generally achieves smaller compression ratios for small and medium sized images than for text data of the same size. This is due to 1) the larger alphabet size, 2) the enormous number of different contexts especially in higher order models, and 3) quantization noise introduced during the digitization process. T...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1212.2668 شماره
صفحات -
تاریخ انتشار 2012